Goto

Collaborating Authors

 Santander Department


Sparse Network Inference under Imperfect Detection and its Application to Ecological Networks

arXiv.org Machine Learning

Abstract--Recovering latent structure from count data has received considerable attention in network inference, particularly when one seeks both cross-group interactions and within-group similarity patterns in bipartite networks, which is widely used in ecology research. Such networks are often sparse and inherently imperfect in their detection. Existing models mainly focus on interaction recovery, while the induced similarity graphs are much less studied. Moreover, sparsity is often not controlled, and scale is unbalanced, leading to oversparse or poorly rescaled estimates with degrading structural recovery. We impose nonconvex ℓ1/2 regularization on the latent similarity and connectivity structures to promote sparsity within-group similarity and cross-group connectivity with better relative scale. To solve it, we develop an ADMM-based algorithm with adaptive penalization and scale-aware initialization and establish its asymptotic feasibility and KKT stationarity of cluster points under mild regularity conditions. Experiments on synthetic and real-world ecological datasets demonstrate improved recovery of latent factors and similarity/connectivity structure relative to existing baselines. Index Terms--augmented Lagrangian, nonconvex nonsmooth optimization, nonnegative matrix factorization, link prediction, ecological network inference, structured sparse recovery I. INTRODUCTION This setting is inherent in sensing and monitoring applications [3], [4], where observations, such as counts, are obtained via an imperfect sampling process. In this paper, we are interested in ecological interaction networks describing how species associate with locations and how environments shape biodiversity patterns [5], [6].


Learning to Recorrupt: Noise Distribution Agnostic Self-Supervised Image Denoising

arXiv.org Machine Learning

Self-supervised image denoising methods have traditionally relied on either architectural constraints or specialized loss functions that require prior knowledge of the noise distribution to avoid the trivial identity mapping. Among these, approaches such as Noisier2Noise or Recorrupted2Recorrupted, create training pairs by adding synthetic noise to the noisy images. While effective, these recorruption-based approaches require precise knowledge of the noise distribution, which is often not available. We present Learning to Recorrupt (L2R), a noise distribution-agnostic denoising technique that eliminates the need for knowledge of the noise distribution. Our method introduces a learnable monotonic neural network that learns the recorruption process through a min-max saddle-point objective. The proposed method achieves state-of-the-art performance across unconventional and heavy-tailed noise distributions, such as log-gamma, Laplace, and spatially correlated noise, as well as signal-dependent noise models such as Poisson-Gaussian noise.


Long-form factuality in large language models Jerry Wei 1 Chengrun Y ang 1 Xinying Song 1 Yifeng Lu

Neural Information Processing Systems

To benchmark a model's long-form factuality in open domains, we first use GPT -4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factuality through a method which we call Search-Augmented Factuality Evaluator (SAFE).



A multitask transformer to sign language translation using motion gesture primitives

arXiv.org Artificial Intelligence

The absence of effective communication the deaf population represents the main social gap in this community. Furthermore, the sign language, main deaf communication tool, is unlettered, i.e., there is no formal written representation. In consequence, main challenge today is the automatic translation among spatiotemporal sign representation and natural text language. Recent approaches are based on encoder-decoder architectures, where the most relevant strategies integrate attention modules to enhance non-linear correspondences, besides, many of these approximations require complex training and architectural schemes to achieve reasonable predictions, because of the absence of intermediate text projections. However, they are still limited by the redundant background information of the video sequences. This work introduces a multitask transformer architecture that includes a gloss learning representation to achieve a more suitable translation. The proposed approach also includes a dense motion representation that enhances gestures and includes kinematic information, a key component in sign language. From this representation it is possible to avoid background information and exploit the geometry of the signs, in addition, it includes spatiotemporal representations that facilitate the alignment between gestures and glosses as an intermediate textual representation. Keywords: Sign language translation, gloss, transformer, deep learning representations 2010 MSC: 00-01, 99-00 1. Introduction Approximately 1 .5 billion people have some associated degree of hearing loss worldwide. These languages are composed of visio-spatial gestural movements and expressions, together with complex manual and non-manual interactions. Today there are more than 150 official SLs with multiple variations in each country. Like any language, there is an intrinsic grammatical richness with multiple gestural and expressive variations. These aspects make the modeling of SLs a very challenging task, even for the most advanced computer vision and representation learning methodologies. In fact, signs do not have a direct written representation, which makes it more difficult to structure the language, implying major challenges to find correspondence with other textual languages.


Development of a Deep Learning Model for the Prediction of Ventilator Weaning

arXiv.org Artificial Intelligence

The issue of failed weaning is a critical concern in the intensive care unit (ICU) setting. This scenario occurs when a patient experiences difficulty maintaining spontaneous breathing and ensuring a patent airway within the first 48 hours after the withdrawal of mechanical ventilation. Approximately 20 of ICU patients experience this phenomenon, which has severe repercussions on their health. It also has a substantial impact on clinical evolution and mortality, which can increase by 25 to 50. To address this issue, we propose a medical support system that uses a convolutional neural network (CNN) to assess a patients suitability for disconnection from a mechanical ventilator after a spontaneous breathing test (SBT). During SBT, respiratory flow and electrocardiographic activity were recorded and after processed using time-frequency analysis (TFA) techniques. Two CNN architectures were evaluated in this study: one based on ResNet50, with parameters tuned using a Bayesian optimization algorithm, and another CNN designed from scratch, with its structure also adapted using a Bayesian optimization algorithm. The WEANDB database was used to train and evaluate both models. The results showed remarkable performance, with an average accuracy 98 when using CNN from scratch. This model has significant implications for the ICU because it provides a reliable tool to enhance patient care by assisting clinicians in making timely and accurate decisions regarding weaning. This can potentially reduce the adverse outcomes associated with failed weaning events.


Diagnosis of Patients with Viral, Bacterial, and Non-Pneumonia Based on Chest X-Ray Images Using Convolutional Neural Networks

arXiv.org Artificial Intelligence

According to the World Health Organization (WHO), pneumonia is a disease that causes a significant number of deaths each year. In response to this issue, the development of a decision support system for the classification of patients into those without pneumonia and those with viral or bacterial pneumonia is proposed. This is achieved by implementing transfer learning (TL) using pre-trained convolutional neural network (CNN) models on chest x-ray (CXR) images. The system is further enhanced by integrating Relief and Chi-square methods as dimensionality reduction techniques, along with support vector machines (SVM) for classification. The performance of a series of experiments was evaluated to build a model capable of distinguishing between patients without pneumonia and those with viral or bacterial pneumonia. The obtained results include an accuracy of 91.02%, precision of 97.73%, recall of 98.03%, and an F1 Score of 97.88% for discriminating between patients without pneumonia and those with pneumonia. In addition, accuracy of 93.66%, precision of 94.26%, recall of 92.66%, and an F1 Score of 93.45% were achieved for discriminating between patients with viral pneumonia and those with bacterial pneumonia.


Spatio-temporal transformer to support automatic sign language translation

arXiv.org Artificial Intelligence

Sign Language Translation (SLT) systems support hearing-impaired people communication by finding equivalences between signed and spoken languages. This task is however challenging due to multiple sign variations, complexity in language and inherent richness of expressions. Computational approaches have evidenced capabilities to support SLT. Nonetheless, these approaches remain limited to cover gestures variability and support long sequence translations. This paper introduces a Transformer-based architecture that encodes spatio-temporal motion gestures, preserving both local and long-range spatial information through the use of multiple convolutional and attention mechanisms. The proposed approach was validated on the Colombian Sign Language Translation Dataset (CoL-SLTD) outperforming baseline approaches, and achieving a BLEU4 of 46.84%. Additionally, the proposed approach was validated on the RWTH-PHOENIX-Weather-2014T (PHOENIX14T), achieving a BLEU4 score of 30.77%, demonstrating its robustness and effectiveness in handling real-world variations


Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise

arXiv.org Machine Learning

Recorrupted-to-Recorrupted (R2R) has emerged as a methodology for training deep networks for image restoration in a self-supervised manner from noisy measurement data alone, demonstrating equivalence in expectation to the supervised squared loss in the case of Gaussian noise. However, its effectiveness with non-Gaussian noise remains unexplored. In this paper, we propose Generalized R2R (GR2R), extending the R2R framework to handle a broader class of noise distribution as additive noise like log-Rayleigh and address the natural exponential family including Poisson and Gamma noise distributions, which play a key role in many applications including low-photon imaging and synthetic aperture radar. We show that the GR2R loss is an unbiased estimator of the supervised loss and that the popular Stein's unbiased risk estimator can be seen as a special case. A series of experiments with Gaussian, Poisson, and Gamma noise validate GR2R's performance, showing its effectiveness compared to other self-supervised methods.


A multi-criteria decision support system to evaluate the effectiveness of training courses on citizens' employability

arXiv.org Artificial Intelligence

This study examines the impact of lifelong learning on the professional lives of employed and unemployed individuals. Lifelong learning is a crucial factor in securing employment or enhancing one's existing career prospects. To achieve this objective, this study proposes the implementation of a multi-criteria decision support system for the evaluation of training courses in accordance with their capacity to enhance the employability of the students. The methodology is delineated in four stages. Firstly, a `working life curve' was defined to provide a quantitative description of an individual's working life. Secondly, an analysis based on K-medoids clustering defined a control group for each individual for comparison. Thirdly, the performance of a course according to each of the four predefined criteria was calculated using a t-test to determine the mean performance value of those who took the course. Ultimately, the unweighted TOPSIS method was used to evaluate the efficacy of the various training courses in relation to the four criteria. This approach effectively addresses the challenge of using extensive datasets within a system while facilitating the application of a multi-criteria unweighted TOPSIS method. The results of the multi-criteria TOPSIS method indicated that training courses related to the professional fields of administration and management, hostel and tourism and community and sociocultural services have positive impact on employability and improving the working conditions of citizens. However, courses that demonstrate the greatest effectiveness in ranking are the least demanded by citizens. The results will help policymakers evaluate the effectiveness of each training course offered by the regional government.